Background

As a macro social worker, establishing an understanding of any community you work in or with is imperative, particularly when you are not from that community. Without the appropriate community knowledge, efforts to catalyze positive change will be, at best, ineffectual, and at worst, detrimental. Though “boots on the ground” is always the best method for truly getting to know the community, data are powerful compliments to stories of lived experience, particularly when depicted in attractive and engaging ways.

This semester, my goal was to get comfortable navigating and utilizing data from the U.S. decenniel census and American Community Survey in a variety of data visualization programs. The following is a portfolio of that work.

A Note About Predictors and Consequences

Throughout the semester, I utilized various U.S. Census and ACS data to characterize my communities of interest, including locales in Michigan and California. As I did not perform any statistical manipulations of the data, the variables that I chose stand alone, and I cannot speak to the relationships between them with any type of legitimacy or confidence. As I continue to develop my knowledge around data visualization and analysis, learning how to appropriately apply statistical analyses to understand relationships between predictors and consequences will be a priority.

Data Visualization in R

We began the semester by exploring the functionality of the open-sourced software, R. As I spent three days a week in Detroit for field and class this semester, the differences in Detroit and Ann Arbor, and more broadly, Wayne and Washtenaw Counties, was often at the forefront of my mind. Using the census API, I was able to call data to compare community characteristics in each county. This data was depicted using base R and other packages, including ggplot2 and dplyr. To see the code that went in to each graph, click the “Code” button on the top-right of each tabbed section.

Each of these graphs pulls Census data at the county level. It is important to note that this data is not necessarily representative of Ann Arbor and Detroit, but instead the entire Washtenaw and Wayne Counties. In order to truly compare these locales, it would have been more effective to pull data from county subdivisions, voting districts, or census tracts to access a deeper level of granularity.

Base R

library(tidycensus)
library(tidyverse)

RaceWAYNE<-get_acs(geography = "county",
                      variables = c(
"B02001_002",  
"B02001_003", 
"B02001_004", 
"B02001_005", 
"B02001_006", 
"B02001_007", 
"B02001_008"),
                      state = "mi",
                      county="163",
                      year=2017)

RaceWAYNE$NAME<-'Wayne'
RaceWAYNE$estimate_pct = (RaceWAYNE$estimate / sum(RaceWAYNE$estimate))*100

library(RColorBrewer)
coul <- brewer.pal(7, "Blues")

barplot(RaceWAYNE$estimate_pct, main="Racial Groups in Wayne County, MI",
        ylab = "% of Total Population", 
        names.arg=c("White", "Black/\nAfrican \nAmerican",
                    "American\n Indian & \nAlaska Native",
                    "Asian", "Native\nHawaiian &\nPacific Islander",
                    "Other", "Two or More\n Races"),
        col=coul, las=2, cex.names = .73)

par(mar=c(18,6,2,1)+.1)

Ggplot2 Paired Bar Graph

library(tidycensus)
library(tidyverse)

RaceWASH<-get_acs(geography = "county", 
                  variables = c(
                    "B02001_002", #White 
                    "B02001_003", #Black
                    "B02001_004", #American Indian and Alasks Native
                    "B02001_005", #Asian
                    "B02001_006", #Native Hawaiian and Pacific Islander
                    "B02001_007", #Other
                    "B02001_008"),#Two or more races
                  state = "mi",
                  county="161",
                  year=2017)
RaceWAYNE<-get_acs(geography = "county",
                   variables = c(
                     "B02001_002", #White 
                     "B02001_003", #Black
                     "B02001_004", #American Indian and Alasks Native
                     "B02001_005", #Asian
                     "B02001_006", #Native Hawaiian and Pacific Islander
                     "B02001_007", #Other
                     "B02001_008"),#two or more races
                   state = "mi",
                   county="163",
                   year=2017)
RaceWASH$NAME<-'Washtenaw'
RaceWAYNE$NAME<-'Wayne'
RaceWASH$estimate_pct=(RaceWASH$estimate/sum(RaceWASH$estimate))*100
RaceWAYNE$estimate_pct = (RaceWAYNE$estimate / sum(RaceWAYNE$estimate))*100
RaceBOTH<-rbind(RaceWASH,RaceWAYNE)

vcounty<-RaceBOTH$NAME
vrace<-c("White", "Black/\nAfrican American",
         "American Indian\n and Alaska Native",
         "Asian", "Native Hawaiian\n and Pacific Islander",
         "Other", "Two or More\n Races","White", "Black/\nAfrican American",
         "American Indian\n and Alaska Native",
         "Asian", "Native Hawaiian\n and Pacific Islander",
         "Other", "Two or More\n Races")
vPercent<-RaceBOTH$estimate_pct
GroupedData<-data.frame(vcounty,vrace,vPercent)

library(ggplot2)
library(ggthemes)

ggplot(data = GroupedData, aes(x=vrace,y=vPercent,fill=vcounty))+
  geom_bar(alpha=.6,colour="black", stat="identity",position = position_dodge())+
  theme_few()+
labs(title = "Breakdown of Racial Groups",
     y="% of Total Population")+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  theme(axis.title.x = element_blank(),
        legend.position = "bottom",
        plot.title = element_text(face = "bold",hjust = ".5"))+
  theme(axis.title.y = element_text(face = "bold", size=11))+
  theme(legend.title = element_text(size=11,face="bold"))+
  scale_fill_manual(name= "County", values=c("Washtenaw"= "lightblue4",
                                             "Wayne"= "cadetblue2"))

gglplot2 Normal Bar Chart

library(tidycensus)
library(tidyverse)

Income<-get_acs(geography = "county", 
                  variables ="B19013_001",
                  state = "mi",
                  county=c("161","163"),
                  year=2017)
library(ggplot2)
library(ggthemes)
library(scales)

ggplot(Income,
       aes(x=NAME, y=estimate, fill=NAME))+
  geom_bar(stat="identity", color= "black", alpha=.5)+
  theme_few()+
  scale_fill_manual(values=c("Washtenaw County, Michigan"= "lightblue4",
                             "Wayne County, Michigan"= "cadetblue2"))+
  theme(legend.position = "bottom")+
  labs(title = "Median Household Income", y= "Income ($)")+
  theme(legend.position="none",
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        plot.title = element_text(face = "bold",hjust = ".5", size = "15"),
        axis.text.y = element_text(face="bold"),
        axis.text.x = element_text(face="bold"))+
  scale_y_continuous(labels = dollar)

#Code used for Median Income graph above (excluding data collection):
ggplot(Income,
       aes(x=NAME, y=estimate, fill=NAME))+
  geom_bar(stat="identity", color= "black", alpha=.5)+
  theme_few()+
  scale_fill_manual(values=c("Washtenaw County, Michigan"= "dlightblue4",
                             "Wayne County, Michigan"= "darkturquoise"))+
  theme(legend.position = "bottom")+
  labs(title = "Median Household Income", y= "Income ($)")+
  theme(legend.position="none",
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        plot.title = element_text(face = "bold",hjust = ".5", size = "15"),
        axis.text.y = element_text(face="bold"),
        axis.text.x = element_text(face="bold"))+
  scale_y_continuous(labels = dollar)

ggplot2 Density Plot

library(tidycensus)
library(tidyverse)
library(dplyr)

HousingWASH<-get_acs(geography = "tract", 
                     variables = "B25077_001",
                     state = "mi",
                     county="161",
                     year=2017)
HousingWAYNE<-get_acs(geography = "tract",
                      variables = "B25077_001",
                      state = "mi",
                      county="163",
                      year=2017)
HousingWASH$NAME<-'Washtenaw'
HousingWAYNE$NAME<-'Wayne'
BothCounties<-rbind(HousingWASH,HousingWAYNE)
BothCountiesClean<-BothCounties %>% filter(!is.na(estimate))

options(scipen = 999)

library(ggplot2)
library(ggthemes)
library(sitools)

ggplot(BothCountiesClean,
       aes(x=estimate, fill = NAME)) + geom_density(alpha=.5)+
  theme_few()+
  labs(title = "A Tale of Two Counties",
       subtitle = "Home Values, Compared",
       x="Median Home Value ($)")+
  theme(plot.title = element_text(face="bold"))+
  theme(axis.title.x = element_text(size=11,face="bold"))+
  scale_x_log10("Median Home Value ($)", labels=f2si)+
  theme(axis.title.y = element_text(size=11, face="bold"))+
  theme(axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank(),
        legend.position = "bottom")+
  theme(plot.title = element_text(hjust = .5))+
  theme(plot.subtitle = element_text(hjust = .5))+
theme(legend.title = element_text(size=11,face="bold"))+
  scale_fill_manual(name= "County", values=c("Washtenaw"= "lightblue4",
                             "Wayne"= "cadetblue2"))

QGIS

Next, we progressed to geographic data visualization in QGIS. With this software, I was able to experiment with visualizing U.S. Census Data in map form. The counties map below was created by joining U.S Census data to a Michigan counties .shapefile in order to show median home values across the state.

For the second map, I opted to utilize data provided by the San Francisco Mayor’s Office to illustrate the landscape of affordable rental projects overlaid onto the Invest in Neighborhoods initiative project areas. Rental prices in SF have been at the forefront of my mind as I will soon being moving to the area for a summer internship. Beyond my own interest, there are significant implications for examining where affordable rental projects are located throughout the city. Coupled with information about the IIN, which seeks to “strengthen and revitalize neighborhood commercial districts around San Francisco,” these data are provide a powerful starting point to generate conversation about gentrification and community-based economic development.

For more information about the Invest in Neighborhoods Initiative, see http://investsf.org/.

Counties Map

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/MAP.png")
grid.raster(img)

SF Affordable Rentals and Neighborhood Investment

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/sf.png")
grid.raster(img)

Tableau

Finally, we finished the semester by familiarizing ourselves with Tableau, a fairly intuitive drag-and-drop data visualization software that makes it easy to produce beautiful graphs. Again, I dug into U.S. Census data to depict various housing-related characteristics. For this assignment, I chose to compare the rental landscape in California and Michigan at the state level.

Though any of these graphs can stand alone, I think the presentation of all of them together in the Tableau “Dashboard” is particularly compelling, and a format that I could envision being used in an annual report or other professional publication.

Bar Charts

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/BarCharts.png")
grid.raster(img)

Pie Charts

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/PieChartsBIG.png")
grid.raster(img)

Treemaps

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/Treemaps.png")
grid.raster(img)

Full Dashboard

library(png)
library(grid)

img<-readPNG("~/Desktop/Data Viz/FINAL.png")
grid.raster(img)

Moving Forward

Over the course of the semester, I have become fairly comfortable operating R, QGIS, and Tableau. Moving forward, my goal is to develop an even deeper competency in R and QGIS, specifically, in a way that allows me to handle agency and government data with ease. The ability to analyze and depict data in appealing, engaging ways will be a huge asset as I launch my career and involve myself in community-based work. In the meantime, however, I’ll move forward with learning data visualization skills in the spirit of:

“I want to make beautiful things, even if nobody cares.”